
Journal of Speech, Language, and Hearing Research

American Speech-Language-Hearing Association

All preprints, ranked by how well they match Journal of Speech, Language, and Hearing Research's content profile, based on 10 papers previously published here. The average preprint has a 0.00% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Cross-Linguistic Analysis of Speech Markers: Insights from English, Chinese, and Italian Speakers

Santi, G. C.; Catricala, E.; Kwan, S.; Wong, A.; Ezzes, Z.; Wauters, L.; Esposito, V.; Conca, F.; Gibbons, D.; Fernandez, E.; Santos-Santos, M. A.; Chen, T.-F.; Kwan-Chen, L. L.-Y.; Lo, R. R.; Tsoh, J.; Lung-Tat Chen, A.; Garcia, A. M.; de Leon, J.; Miller, Z.; Vonk, J. M. J.; Bruffaerts, R.; Grasso, S. M.; Allen, I. E.; Cappa, S. F.; Gorno-Tempini, M.-L.; Tee, B. L.

2024-10-16 neurology 10.1101/2024.10.15.24314191 medRxiv
Top 0.1%
12.6%

Cross-linguistic studies with healthy individuals are vital, as they can reveal typologically common and different patterns while providing tailored benchmarks for patient studies. Nevertheless, cross-linguistic differences in narrative speech production, particularly among speakers of languages belonging to distinct language families, have been inadequately investigated. Using a picture description task, we analyze cross-linguistic variations in connected speech production across three linguistically diverse groups of cognitively normal participants: English, Chinese (Mandarin and Cantonese), and Italian speakers. We extracted 28 linguistic features, encompassing phonological, lexico-semantic, morpho-syntactic, and discourse/pragmatic domains. We utilized a semi-automated approach with Computerized Language ANalysis (CLAN) to compare the frequency of production of various linguistic features across the three language groups. Our findings revealed distinct proportional differences in linguistic feature usage among English, Chinese, and Italian speakers. Specifically, we found reduced production of prepositions, conjunctions, and pronouns, and increased adverb use, in the Chinese speakers compared with the other two groups. Furthermore, English participants produced a higher proportion of prepositions, while Italian speakers produced significantly more conjunctions and empty pauses than the other groups. These findings demonstrate that the frequency of specific linguistic phenomena varies across languages, even when the same harmonized task is used. This underscores the critical need to develop linguistically tailored language assessment tools and to identify speech markers that are appropriate for aphasia patients across different languages.

2
Phonemic awareness deficits in an alphasyllabary language: Effects of task type and linguistic complexity in children with Specific Learning Disorder-Reading

Soman, A.; Dev, S. S.; Ravindren, R.

2026-04-07 psychiatry and clinical psychology 10.64898/2026.04.02.26349894 medRxiv
Top 0.1%
12.4%

Background Phonemic awareness deficits are a core feature of Specific Learning Disorder-Reading (SLD-R). How task- and language-specific factors influence these deficits in alphasyllabary languages may help clarify the cognitive mechanisms underlying reading impairment in SLD-R. Methods Thirty children with a DSM-5 diagnosis of SLD-R (mean age 11.4 years) and 29 age-matched typically developing children were given phoneme blending (words and pseudowords) and segmentation tasks in Malayalam. The effects of age and consonant clusters on task performance were evaluated. Results Children with SLD-R performed significantly worse than controls across most phonemic awareness tasks, with the largest deficits observed in pseudoword blending and word blending, and smaller deficits in segmentation. No significant difference was observed for initial phoneme deletion. In typically developing children, age showed strong positive correlations with phonemic performance across most tasks, whereas the SLD-R group showed weak or absent correlations, except in word blending and initial phoneme deletion. Consonant clusters significantly affected performance in both groups, with SLD-R showing more severe deficits. Conclusions Phonemic awareness deficits observed in SLD-R in alphasyllabary languages like Malayalam are more prominent in tasks where lexical support is absent, like pseudoword blending. These deficits vary across task types and linguistic complexity. Phonemic awareness improves with age in typically developing children, while improvement is uneven in children with SLD-R. The findings suggest that phonemic awareness deficits are a core feature of SLD-R across languages, but their manifestation is shaped by orthographic and linguistic characteristics of the writing system.

3
Automated Phonological Error Scoring for Children with Language and Hearing Impairment

Sundstrom, S.; Themistocleous, C.

2024-09-04 neurology 10.1101/2024.09.04.24313011 medRxiv
Top 0.1%
10.3%

Purpose: Phonological production impairments are prevalent in children with developmental language disorder (DLD) and hearing impairment (HI). This study aims to quantify and compare phonological errors in Swedish-speaking children using a novel automated assessment tool, and to provide an automated machine learning algorithm that classifies children with DLD and HI against age-matched controls based on phonological errors. Methods: 72 Swedish-speaking children (29 with DLD, 14 with HI, and 29 typically developing) participated. Phonological production was elicited using a 72-item confrontation naming task. A novel tool was developed to calculate a composite phonological error score and specific scores for different phonological errors (deletions, insertions, substitutions, and transpositions) from written speech productions. This tool leverages the International Phonetic Alphabet (IPA) and a form of the normalized Damerau-Levenshtein distance for accurate error analysis. Results: The composite score successfully differentiated between children with DLD and typically developing children, highlighting its sensitivity in detecting phonological impairment. Machine learning models could accurately differentiate between children with and without language disorders. However, children with DLD and HI differed only in phonemic deletion errors, which suggests that their phonemic production is otherwise relatively similar. Conclusions: Children with DLD and HI exhibit significantly higher phonological error rates than typically developing peers, and the two groups are comparably impaired in phonology (as manifested by the composite phonological score). These findings highlight the potential of machine learning for early identification and targeted intervention in language disorders, improving outcomes for affected children, and demonstrate the potential of a multilingual tool for scoring phonological errors.
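The abstract names a normalized Damerau-Levenshtein distance but does not specify its exact form; a minimal sketch, assuming the restricted (optimal string alignment) variant over IPA transcriptions and normalization by the longer string, might look like this (function names are illustrative, not the authors' tool):

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    the minimum number of deletions, insertions, substitutions, and adjacent
    transpositions needed to turn string a into string b."""
    m, n = len(a), len(b)
    # d[i][j] = distance between prefixes a[:i] and b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            # adjacent transposition, e.g. "ab" -> "ba"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[m][n]


def normalized_error_score(target_ipa: str, produced_ipa: str) -> float:
    """Hypothetical composite score: edit distance divided by the longer
    transcription, so 0.0 = identical and 1.0 = maximally different."""
    if not target_ipa and not produced_ipa:
        return 0.0
    return damerau_levenshtein(target_ipa, produced_ipa) / max(
        len(target_ipa), len(produced_ipa))
```

Counting a transposition as one operation (rather than two substitutions) matters here, since phoneme transpositions are one of the error types the tool scores separately.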

4
EEG responses to auditory cues predict fluency variability and stuttering intervention outcome

Rocha, M. F.; Carmona, J.; Correia, J. M.

2025-02-24 neuroscience 10.1101/2025.02.21.635719 medRxiv
Top 0.1%
6.9%

Stuttering is a variable speech disorder whose brain mechanisms remain unknown. Sensorimotor brain circuits, critical for motor-speech control, including auditory processing necessary for speech prediction and monitoring, have been linked to the disorder. Despite considerable advances, it remains unclear whether auditory circuits relate to stuttering variability, and whether the panoply of interventions for persons who stutter can lead to brain changes within these circuits. We employed electroencephalography (EEG) in a group of persons who stutter, in combination with auditory probes, to tap into the role of auditory cortical regions in stuttering variability. Participants produced flexible speech (i.e., describing visual scenes) and non-flexible speech (i.e., reading syllables), following an auditory cue. More pronounced P200 auditory evoked potentials were observed in participants with higher dysfluency rates, mainly in the spontaneous speech task. Interestingly, speech therapy intervention led to a reduction of the P200 potential, which was in turn significantly related to fluency improvements. Furthermore, EEG response patterns discriminative of cue frequency (400 or 800 Hz tones) were also predictive of dysfluency scores. Our study highlights the involvement of auditory cortical processing and of auditory attention in stuttering variability. We propose that a higher state of auditory alertness may be implicated in the sensorimotor mechanisms of stuttering, and that speech therapy interventions promoting more self-confident communication can restrain auditory alertness and potentially reduce speech dysfluencies. Highlights:
- Auditory probes can assess the auditory cortex in speech production and stuttering.
- Stuttering severity correlates with EEG auditory responses during speech preparation.
- Higher states of auditory alertness in stuttering may be reduced by speech therapy.
Graphical abstract (figure omitted): Speaking requires orchestrating several brain processes at a time. The auditory system assumes a central role, not only in waiting for the right moment to initiate speech, listening to self-produced speech, and predicting the consequences of future speech, but also in adjusting these processes to the intermittent nature of stuttering.

5
Automated transcription in primary progressive aphasia: Accuracy and effects on classification

Clarke, N.; Morin, B.; Bedetti, C.; Bogley, R.; Pellerin, S.; Houze, B.; Ramkrishnan, S.; Ezzes, Z.; Miller, Z.; Gorno Tempini, M. L.; Vonk, J. M. J.; Brambati, S. M.

2026-02-26 neurology 10.64898/2026.02.24.26346981 medRxiv
Top 0.1%
6.5%

Introduction: Connected speech analyses can help characterize linguistic impairments in primary progressive aphasia (PPA) and classify variants; however, manual transcription of speech samples is time-consuming and expensive. Automated speech recognition (ASR) may be effective for transcribing PPA speech. Methods: Transcripts of picture descriptions (109 PPA, 32 healthy controls (HC)) were generated using a manual, automated (Whisper), or semi-automated approach including a quality control (QC) step. We evaluated transcript accuracy, the reliability of ASR-derived linguistic features, and classification performance. Results: Whisper demonstrated the lowest error rates for HC, followed by the semantic, logopenic, and non-fluent PPA variants. Errors correlated with overall disease severity for the semantic and logopenic variants. QC of Whisper outputs reduced errors and improved the reliability of linguistic features. Overall, ASR-derived features achieved better classification performance than manual transcription features. Discussion: Results support the use of off-the-shelf ASR for scalable, cost-efficient transcription and classification of PPA speech.

6
The impact of cognitive ability on multitalker speech perception in neurodivergent individuals

Lau, B. K.; Emmons, K.; Maddox, R. K.; Estes, A.; Dager, S.; (Astley) Hemingway, S.; Lee, A. K.

2022-09-20 psychiatry and clinical psychology 10.1101/2022.09.19.22280007 medRxiv
Top 0.1%
6.3%

The ability to selectively attend to one talker in the presence of competing talkers is crucial to communication. Here we investigate whether cognitive deficits in the absence of hearing loss can impair speech perception. We tested typically hearing, neurodivergent adolescents/adults with autism spectrum disorder or fetal alcohol spectrum disorder, and an age- and sex-matched neurotypical group. We found a strong correlation between IQ and speech perception, with individuals with lower IQ scores having worse speech thresholds. These results demonstrate that deficits in cognitive ability, despite intact peripheral encoding, can impair listening under complex conditions. These findings have important implications for conceptual models of speech perception and for audiological services to improve communication in real-world environments for neurodivergent individuals.

7
AI-based Speech Error Detection to Differentiate Primary Progressive Aphasia Variants

Vonk, J. M. J.; Lian, J.; Cho, C. J.; Antonicelli, G.; Ezzes, Z.; Wauters, L. D.; Keegan-Rodewald, W.; Kurteff, G. L.; Rodriguez, D. A.; Dronkers, N.; Henry, M. L.; Miller, Z. A.; Mandelli, M. L.; Anumanchipalli, G. K.; Gorno-Tempini, M. L.

2026-02-24 neurology 10.64898/2026.02.23.26346899 medRxiv
Top 0.1%
4.9%

Background: Artificial intelligence (AI) based approaches to speech analysis have the potential to assist with objective speech error analysis in aphasia, but off-the-shelf tools often fail to detect speech errors because they prioritize "fluent transcription." Speech production errors (dysfluencies) are hallmark diagnostic features of the nonfluent (nfvPPA) and logopenic (lvPPA) variants of primary progressive aphasia, yet they can be challenging to detect and characterize even for expert clinicians. This study aimed to evaluate whether the novel automated lightweight Scalable Speech Dysfluency Modeling system (SSDM-L), specifically designed to detect dysfluencies, could accurately distinguish PPA variants using voice recordings of individuals reading a brief passage. Method: Participants included a total of 104 individuals: 40 with nfvPPA, 40 with lvPPA (matched on disease severity), and 24 healthy controls, who read aloud the Grandfather Passage as part of a widely used motor speech evaluation (MSE). We automatically extracted ten speech error (dysfluency) variables using SSDM-L, including insertions, replacements, and deletions at both the phoneme and word levels, and phoneme-level prolongations and repetitions. Group differences were assessed via ANCOVAs controlling for age, education, and disease severity (MMSE, CDR sum-of-boxes). To test clinical relevance, we performed correlation analyses with MSE ratings provided by experienced speech-language pathologists (i.e., the gold standard) within the nfvPPA group. Classification performance was assessed by training random forest and XGBoost machine-learning models with 5-fold cross-validation. Results: All individuals read the entire passage in less than five minutes. SSDM-L detected eight of the ten predefined dysfluency features at sufficient frequency to include them in subsequent analyses. All eight features distinguished PPA from controls (p<.006). Individuals with nfvPPA made more errors than the lvPPA group on every feature (all p<.023). Each feature showed a moderate positive correlation with a global MSE apraxia/dysarthria score (r=.31-.56; p<.001-.053). Together, the eight features classified nfvPPA versus lvPPA at AUC=.806 (random forest) and AUC=.776 (XGBoost). Discussion: AI-based automated speech error analysis accurately distinguished the nfvPPA and lvPPA variants using a brief reading task. This quick, error-sensitive, scalable AI system could provide a practical tool to aid diagnosis in aphasia and motor speech disorders.
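The reported AUC values can be read through the rank-sum identity: AUC is the probability that a randomly chosen nfvPPA case receives a higher classifier score than a randomly chosen lvPPA case. A minimal sketch of that computation, with made-up labels and scores rather than the study's data:

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity:
    the fraction of positive/negative pairs in which the positive case
    outscores the negative one, counting ties as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

On this reading, AUC=.806 means that for about 81% of nfvPPA/lvPPA pairs, the model scores the nfvPPA recording as more dysfluent.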

8
Comparing Narrative Storytelling Ability in Individuals with Autism Spectrum Disorder and Fetal Alcohol Spectrum Disorder

Pham, L.; Lee, A. K.; Estes, A.; Dager, S.; (Astley) Hemingway, S.; Thorne, J.; Lau, B. K.

2022-09-20 psychiatry and clinical psychology 10.1101/2022.09.20.22280005 medRxiv
Top 0.1%
4.8%

Purpose: Narrative discourse, or storytelling, is used in daily conversational interaction and reveals higher-level language skills that may not be well captured by standardized assessments of language. Many individuals with autism spectrum disorder (ASD) and fetal alcohol spectrum disorder (FASD) show difficulty with pragmatic language skills, and narrative production offers one method of assessing expressive and pragmatic language skills in an ecologically relevant manner. This study investigated narrative abilities, at both local and global levels, of adolescents/young adults with ASD and FASD and an age- and sex-matched comparison group. Method: Narratives from forty-five adolescents/young adults (11 with ASD, 11 with FASD, and 23 in an age- and sex-matched neurotypical comparison group) were elicited using a wordless storybook. They were then transcribed orthographically, formatted to the Systematic Analysis of Language Transcripts (SALT) convention, and scored with the narrative scoring scheme (NSS). Additional analyses investigated local language measures, such as the number of mental state and temporal relation terms produced, as well as global language measures, including mean length of utterance, total number of different words, total number of words, total number of utterances, rate of speech, and the NSS total score. Results: On local language measures, no significant group differences were found. On global language measures, many aspects of narrative production in the ASD and FASD groups were comparable to each other and to the comparison group, although important differences were observed for the total number of words produced and rate of speech. Conclusions: Given the significant variability observed within groups, these findings suggest that language abilities should be assessed at an individual level. Future research should also consider additional variables that influence narrative production, such as motivation, distractibility, or decision-making of individual subjects.

9
Iceberg or cut off - how adults who stutter articulate fluent-sounding utterances

Leha, A.; Dickhut, S.; Primassin, A.; Korzeczek, A.; Joseph, A. A.; Paulus, W.; Frahm, J.; Sommer, M.

2020-04-17 neuroscience 10.1101/2020.04.15.042432 medRxiv
Top 0.1%
4.8%

Whether fluent-sounding utterances of adults who stutter (AWS) are normally articulated is unclear. We asked 15 AWS and 17 matched adults who do not stutter (ANS) to utter the pseudoword "natscheitideut" 15 times in a 3 T MRI scanner while recording real-time MRI videos at 55 frames per second in a mid-sagittal plane. All stuttered or otherwise dysfluent runs were discarded. We modeled the movement of the tip of the tongue, the lips, and the velum. We observed reproducible movement patterns of the inner and outer articulators which were similar in both groups. Speech duration was similar in both groups and decreased over repetitions, more so in ANS than in AWS. The variability of the movement patterns of the tongue, lips, and velum decreased over repetitions, and the extent of this decrease was similar in both groups. Across all participants, the repetition effect on movement variability for the lips and the tip of the tongue was less pronounced in severely than in mildly stuttering individuals. We conclude that there is no major difference between the groups in the movement patterns of a fluent-sounding utterance. This encourages studies looking at state rather than trait markers of speech dysfluency.

10
Automated analysis of written language in the three variants of primary progressive aphasia

Josephy-Hernandez, S.; Rezaii, N.; Jones, A.; Loyer, E.; Hochberg, D.; Quimby, M.; Wong, B.; Dickerson, B. C.

2022-07-25 neurology 10.1101/2022.07.24.22277977 medRxiv
Top 0.1%
4.4%

Despite the important role of written language in everyday life, abnormalities in functional written communication have been sparsely investigated in Primary Progressive Aphasia (PPA). Prior studies have analyzed written language separately in the three variants of PPA - nonfluent (nfvPPA), logopenic (lvPPA), and semantic (svPPA) - but have rarely compared them to each other or to spoken language. Manual analysis of written language can be a time-consuming process. We developed a program which uses a language parser and quantifies content units (CU) and total units (U) in written language samples. The program was used to analyze written and spoken descriptions of the WAB Picnic scene, based on a pre-defined CU corpus. We then calculated the ratio of CU to U (CU/U Ratio) as a measure of content density. Our cohort included 115 participants (20 control participants for written, 20 control participants for spoken, 28 participants with nfvPPA, 30 with lvPPA, and 17 with svPPA). We compared written language between patients with PPA and control participants and written to spoken language in patients with the three variants of PPA. Finally, we analyzed CU and U in relation to the Progressive Aphasia Severity Scale Sum of Boxes and the Clinical Dementia Rating Sum of Boxes. Our program identified CU with a validity of 99.7% (95%CI 99.5 to 99.8) compared to manual annotation of the samples. All patients with PPA wrote fewer total units than controls (p<0.001). Patients with lvPPA (p=0.013) and svPPA (p=0.004) wrote fewer CU than controls. The CU/U Ratio was higher in nfvPPA and svPPA than in controls (p=0.019 in both cases), but no different between lvPPA patients and controls (p=0.962). Participants with lvPPA (p<0.001) and svPPA (p=0.04) produced fewer CU in written samples compared to spoken. A two-way ANOVA showed that all groups produced fewer units in written samples compared to spoken (p<0.001).
However, the decrease in written CU compared to spoken was smaller than the decrease in written units compared to spoken in participants with PPA, resulting in a larger written CU/U Ratio when compared to spoken language (p<0.001). nfvPPA patients produced correlated written and spoken CU (R=0.5, p=0.009) and total units (R=0.64, p<0.001), but this was not the case for lvPPA or svPPA. Considering all PPA patients, fewer CU were produced by those with greater aphasia severity (PASS SoB, R=-0.24, p=0.04) and dementia severity (CDR SoB, R=-0.34, p=0.004). In conclusion, we observed reduced written content in patients with PPA compared to controls, with a preference for content over non-content units in patients with nfvPPA and svPPA. When comparing written to spoken language, we observed a similar "telegraphic" style in both modalities in patients with nfvPPA, which differed from patients with svPPA and lvPPA, who used significantly fewer non-content units in writing than in speech. Lastly, we show how our program provides a time-efficient tool that could enable feedback and tracking of writing as an important feature of language and cognition.
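The CU/U Ratio described above is a simple proportion; a hypothetical sketch, assuming a sample arrives as a list of unit strings checked against the predefined CU corpus (the matching in the authors' parser is certainly more elaborate):

```python
def content_density(sample_units, cu_corpus):
    """Hypothetical CU/U Ratio: the fraction of produced units that match a
    predefined content-unit corpus (case-insensitive). Higher values mean
    denser, more 'telegraphic' output; lower values mean more non-content
    units per content unit."""
    corpus = {cu.lower() for cu in cu_corpus}
    total = len(sample_units)  # U: total units produced
    if total == 0:
        return 0.0
    content = sum(u.lower() in corpus for u in sample_units)  # CU
    return content / total
```

This makes the paper's key contrast concrete: if written samples lose proportionally more non-content units than content units, the ratio rises even though both CU and U fall.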

11
Speech-Based Markers in Paediatric ADHD: A Longitudinal Case-Control Study of Voice Features and Medication Effects

Bamberger, R.; Kuhles, G.; Lotter, L. D.; Dukart, J.; Konrad, K.; Guenther, T.; Siniatchkin, M.; Fuchs, M.; von Polier, G.

2026-03-31 psychiatry and clinical psychology 10.64898/2026.03.25.26348708 medRxiv
Top 0.1%
4.3%

Background Diagnosis and treatment monitoring of attention-deficit/hyperactivity disorder (ADHD) largely rely on subjective assessments, highlighting the need for objective markers. Voice features and speech embeddings represent promising candidates for such markers, as they may capture alterations in speech production relevant to ADHD. However, it remains unclear which speech features are most informative for distinguishing ADHD and monitoring treatment effects, and which speech tasks most reliably elicit such differences. Methods Twenty-seven children with ADHD and 27 age-matched neurotypical controls completed six speech tasks across two study visits. Children with ADHD were unmedicated at baseline (first visit) and were assessed under prescribed methylphenidate treatment at follow-up, whereas controls underwent repeated assessment without intervention. Established acoustic voice features (eGeMAPS) and high-dimensional speech embeddings (WavLM, Whisper) were extracted and analysed using linear mixed models to examine baseline group differences and group-by-time interaction effects reflecting medication-associated change patterns. Results At baseline, children with ADHD differed significantly from controls in frequency, spectral, and temporal voice features, characterized by lower and more variable pitch, altered spectral properties, and reduced rhythmic stability. Group-by-time interaction effects indicated medication-associated modulation in the ADHD group, including reduced loudness variability and increased precision of vowel articulation at follow-up, changes not observed in controls. Speech embeddings revealed additional baseline and interaction effects beyond established acoustic features. Free speech tasks, particularly picture description, yielded the most robust and consistent effects.
Conclusion Children with ADHD differed from neurotypical controls in vocal features at baseline and showed distinct longitudinal change patterns consistent with medication-related change. These findings support further investigation of speech-based measures as candidate digital phenotypes and potential digital biomarkers in ADHD, with picture description emerging as a particularly promising task for future clinical assessment protocols.

12
Consistency of Linguistic and Cognitive Processing Measures to Discriminate Children with and without Developmental Language Disorder (DLD): Comparing Likelihood Ratios (LHs) and Elastic Net Regression Computational Models.

Sharma, S.; Golden, R. M.; Montgomery, J. W.; Gillam, R. B.; Evans, J.

2026-03-09 psychiatry and clinical psychology 10.64898/2026.03.09.26347082 medRxiv
Top 0.1%
4.2%

Because both monothetic and polythetic diagnostic classification approaches focus on the presence of individual symptom(s) to identify individuals in a clinical population, they may not be diagnostically sensitive clinical markers of multidimensional disorders such as developmental language disorder (DLD). DLD researchers have also used likelihood ratios (LHs) to identify possible diagnostic clinical markers of DLD; however, the diagnostic sensitivity of LHs varies markedly across studies. A recent multidimensional computational elastic-net regression examined a total of 71 measures of spoken language and cognitive processing from a cohort of 223 children ages 7;0 to 11;0 with and without DLD (DLD = 110; typically developing (TD) controls = 113). All 200 iterations of the model had high discriminative power (87%-88%) in positively identifying and distinguishing the DLD participants across all thresholds. Notably, the models identified a sparse DLD-specific deficit profile which included only nine of the 71 measures. In this study, we ask whether the individual LHs for each of these nine measures are equally sensitive in identifying and discriminating the children with DLD from TD controls, or whether diagnostic markers of multidimensional disorders such as DLD can only be identified through computational modeling approaches. The LHs for each of the nine measures were in the moderately high range (3.25-10). However, at the highest LH cut points for each measure, there was little to no overlap in the children each measure identified as having DLD. Follow-up analysis revealed that the elastic net model-derived predictive scores for each participant were significantly correlated with the participants' language ability. The model also identified a subgroup of TD participants as having the same DLD-deficit profile as the DLD participants.
This subgroup was younger and predominantly male, with standardized language assessment scores that were lower than those of the larger TD cohort. Taken together, the results from this study show that, because multidimensional modeling approaches such as elastic net regression leverage the variability in the deficit profiles across individual members of a diagnostic group and the unique contributions of each of the behavioral features of the phenotype, they may be an effective tool for deriving diagnostically specific deficit profiles for phenotypically complex, multicausal, multidimensional neurodevelopmental disorders such as DLD. The results also demonstrate the robustness of the derived DLD-specific deficit profile in identifying individuals with "mild" or subclinical DLD, demonstrating the potential utility of this approach in both clinical and research arenas. What this paper adds. What is already known on this subject: The identification of diagnostic markers for DLD has challenged both clinicians and researchers for decades. Monothetic classification markers such as non-word repetition, optional infinitives, or syntactic dependencies have been explored, as well as polythetic classification approaches in which a list of diagnostic symptoms is used together. However, each assumes different criteria and symptoms that should be included as diagnostic markers of DLD. What this study adds: Our study assessed the feasibility and effectiveness of monothetic vs. polythetic classification approaches for identifying DLD. Since our prior work, which used elastic net logistic regression computational modeling with strong discriminatory power, consistently selected nine key features as the DLD-deficit profile, in this effort we calculated the likelihood ratio for each of the nine features to examine each measure's ability to identify children with DLD.
The monothetic approach failed to identify a consistent set of children with DLD, and the polythetic classification approach also did not identify participants who were shown to have mild DLD by the elastic net modeling approach. Instead, our analysis showed that a computational modeling approach such as elastic net regression, which incorporates small but important input from multiple cognitive and linguistic aspects of children, could better capture multifaceted information about the disorder, better account for individual variability, and consistently identify most participants with DLD. Clinical implications of this study: Elastic net logistic regression identifies a small subset of important features for distinguishing DLD and can assign a probability of DLD presence for each participant. Instead of the polythetic and monothetic approaches commonly used in the field, our study shows that integrating advanced computational modeling, such as elastic net regression, with clinician judgment can better refine assessment processes and address prior and ongoing inconsistencies in the DLD literature and diagnostic practices.
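The likelihood ratios discussed above follow from a standard 2x2 diagnostic table; a minimal sketch of the conventional formula (LR+ = sensitivity / (1 - specificity)), not the authors' implementation:

```python
def positive_likelihood_ratio(tp, fn, fp, tn):
    """LR+ = sensitivity / (1 - specificity): how many times more likely a
    positive result on a measure is in children who have DLD than in children
    who do not. Values in the 3-10 range, as reported above, are conventionally
    read as moderate diagnostic evidence."""
    sensitivity = tp / (tp + fn)  # true positives among all DLD cases
    specificity = tn / (tn + fp)  # true negatives among all TD cases
    if specificity == 1.0:
        return float("inf")  # no false positives at this cut point
    return sensitivity / (1.0 - specificity)
```

For example, a measure that flags 80 of 100 DLD children and 10 of 100 TD children yields LR+ = 0.8 / 0.1 = 8, squarely in the moderately high range the study reports.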

13
Differences in brain activity during sentence repetition in people who stutter: a combined analysis of four fMRI studies

Demirel, B.; Chesters, J.; Connally, E.; Gough, P.; Ward, D.; Howell, P.; Watkins, K. E.

2025-07-12 neuroscience 10.1101/2025.07.09.663990 medRxiv
Top 0.1%
3.9%

Our understanding of the neural correlates of developmental stuttering benefits from the use of functional MRI (fMRI) during speech production. Despite two decades of research, however, we have reached little consensus. In the current study, we analysed pooled fMRI data from four different studies that used the same sentence reading task and methodological approach. The combined sample included 56 adolescents and adults who stutter and 53 demographically matched typically fluent controls. A sparse-sampling design was used in each study, in which participants spoke during the silent period between measurements of brain activity. Sentence reading evoked activity in both groups across frontal and temporal regions bilaterally. At statistical thresholds corrected for family-wise error, there were no significant group differences. An uncorrected threshold was applied to explore group differences in areas previously identified in earlier fMRI studies on stuttering. People who stutter (PWS) showed greater activity compared with controls in the right frontal pole, right anterior insula extending to the frontal operculum, left planum temporale, and midbrain, at the level of the red nucleus. In contrast, PWS showed lower activity in the left superior frontal sulcus, subgenual medial prefrontal cortex, right anterior temporal lobe, and portions of the inferior parietal lobe bilaterally, including the angular gyrus on the left. Despite pooling data across multiple studies to achieve a relatively large sample, group differences in regions involved in speech-motor control only emerged at an uncorrected voxel-wise threshold. Some of these findings align with previous fMRI studies, such as increased activity in the right anterior insular cortex.

14
Impaired temporal prediction mechanisms in dyslexia

Bonnet, P. A.; Tillmann, B.; Chettih, E.; Bedoin, N.; Kosem, A.

2026-01-17 neuroscience 10.64898/2026.01.16.699956 medRxiv
Top 0.1%
3.6%

Effective speech analysis involves deconstructing the acoustic signal into identifiable linguistic units, which depends on the ability to recognize and anticipate temporal patterns within the speech stream. However, these processes may be less efficient in individuals with dyslexia. This study investigated the effects of temporal context and related temporal predictions in dyslexic adult participants and matched control participants, using an auditory oddball task with non-verbal stimuli. Pure tones were presented in sequences, and participants were requested to discriminate the pitch of target stimuli. The temporal intervals between the sounds varied in regularity across the sequences, thereby creating contexts with different levels of temporal predictability. At the end of each sequence, participants were prompted to evaluate the perceived rhythmicity of the sequence and to assess their own performance in the auditory discrimination task. Dyslexic participants demonstrated overall lower accuracy in discriminating target sounds than controls. They also showed reduced influence of the temporal context of the sequences on response times, while controls responded faster in sequences that were temporally more regular and predictable. Additionally, individuals with dyslexia perceived the rhythmicity of sound sequences less accurately, overestimating the temporal regularity in irregular sequences and underestimating it in regular sequences. They also reported lower overall confidence in their ability to perform the task compared to control participants. Altogether, these findings provide converging evidence for altered temporal prediction abilities in dyslexia, which may impact auditory perception and, in turn, impair language processing.

15
The Impact of Instructions on Individual Prioritization Strategies in a Dual-Task Paradigm for Listening Effort

Kestens, K.; Lepla, E.; Vandoorne, F.; Ceuleers, D.; Van Goylen, L.; Keppler, H.

2024-06-26 otolaryngology 10.1101/2024.06.26.24309528 medRxiv
Top 0.1%
3.6%

Introduction: This study examined the impact of instructions on the prioritization strategy employed by individuals during a listening effort dual-task paradigm. Methods: The dual-task paradigm consisted of a primary speech understanding task in different listening conditions and a secondary visual memory task, both performed separately (baseline) and simultaneously (dual-task). Twenty-three normal-hearing participants (mean age: 36.8 years; 14 females) were directed to prioritize the primary speech understanding task in the dual-task condition, whereas another twenty-three (matched for age, gender, and education level) received no specific instructions regarding task priority. Both groups performed the dual-task paradigm twice (mean interval: 14.8 days). Patterns of dual-task interference were assessed by plotting the dual-task effects of the primary and secondary tasks against each other. Fisher's exact tests were used to assess whether there was an association between interference patterns and group (non-prioritizing and prioritizing) across all listening conditions and test sessions. Results: No statistically significant association was found between the pattern of dual-task interference and the group to which the participants belonged for any of the listening conditions and test sessions. Descriptive analysis revealed no consistent strategy use within individuals across listening conditions and test sessions, suggesting a lack of a uniform approach regardless of the given instructions. Conclusion: Providing prioritization instructions was insufficient to ensure that an individual will mainly focus on the primary task and consistently adhere to this strategy across listening conditions and test sessions. These results raise reservations about the current usage of dual-task paradigms for listening effort.
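The dual-task effect plotted in this design is conventionally computed as the proportional change in performance from the single-task baseline to the dual-task condition. A minimal sketch of that conventional formula — an assumption for illustration, since the abstract does not state the authors' exact computation:

```python
def dual_task_effect(single, dual, higher_is_better=True):
    """Dual-task effect (DTE) as percentage change from single-task baseline.

    With higher_is_better=True (e.g. percent words correct), negative
    values indicate a dual-task cost; the sign is flipped for measures
    where lower is better (e.g. response time)."""
    change = (dual - single) / single * 100.0
    return change if higher_is_better else -change

# Hypothetical example: speech understanding drops from 90% to 81%
# words correct under dual-task load, a 10% dual-task cost.
print(round(dual_task_effect(90.0, 81.0), 1))  # → -10.0
```

Plotting the primary-task DTE against the secondary-task DTE, as the study describes, then classifies each participant's interference pattern (e.g. mutual cost, primary protected at secondary expense, or no interference).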

16
The impact of speaker accent on discourse processing: a frequency investigation

Thomas, T.; Martin, C. D.; Caffarra, S.

2023-12-19 neuroscience 10.1101/2023.12.19.571836 medRxiv
Top 0.1%
3.6%

Previous studies show that there are differences in native and foreign speech processing (Lev-Ari, 2018), while mixed evidence has been found regarding differences between dialectal and foreign accent processing (see: Adank et al., 2009; Floccia et al., 2006; but see also: Floccia et al., 2009; Girard et al., 2008). Within this field, two theories have been proposed. The Perceptual Distance Hypothesis states that the mechanisms underlying dialectal accent processing are attenuated versions of those underlying foreign accent processing (Clarke & Garrett, 2004), while the Different Processes Hypothesis argues that the mechanisms of foreign and dialectal accent processing are qualitatively different (Floccia et al., 2009). A recent study of single-word EEG data suggested that there may be flexibility in processing mechanisms (Thomas et al., 2022). The present study deepens this investigation by addressing in which frequency bands native, dialectal, and foreign accent processing differ when listening to extended speech. Electroencephalographic data were recorded from 30 participants who listened to dialogues of approximately six minutes spoken in native, dialectal, and foreign accents. Power spectral density estimation (1-35 Hz) was performed. Linear mixed models were fitted in frequency windows of particular relevance to discourse processing. Frequency bands associated with phonemes [gamma], syllables [theta], and prosody [delta] were considered, along with those of general cognitive mechanisms [alpha and beta]. Results show power differences in the gamma frequency range. While in higher frequency ranges foreign accent processing is differentiated from the power amplitudes of native and dialectal accent processing, in low frequencies we do not see any accent-related power amplitude modulations. This suggests that there may be a difference in phoneme processing between native accent types and foreign accents, while we speculate that top-down mechanisms during discourse processing may mitigate the effects observed with short units of speech.

17
A Sensory-Cognitive Dissociation in Listeners with Hearing Difficulties: An Exploratory Analysis Linking Tinnitus to Binaural Unmasking Deficits and Speech Complaints to Memory

Bleeck, S.; Hamza, Y.

2025-12-19 otolaryngology 10.64898/2025.12.18.25342552 medRxiv
Top 0.1%
3.5%

Background: The construct of Hidden Hearing Loss (HHL) proposes a link between patient-reported hearing difficulties and underlying neural deficits not captured by the standard audiogram. However, the heterogeneity of this population challenges the utility of HHL as a unitary diagnosis. This study presents an exploratory analysis aimed at deconstructing the HHL symptom complex. Methods: In 30 participants with a range of hearing abilities and complaints, we measured binaural unmasking using the Binaural Intelligibility Level Difference (BILD). We employed a two-stage analysis. First, a "lumping" analysis tested whether participants could be grouped into a unitary "HHL profile" that predicted a BILD deficit, using both theory-driven classification and data-driven clustering. Second, after this approach failed, a pre-planned exploratory "splitting" analysis used a Linear Mixed-Effects Model (LMM) to investigate whether individual clinical markers (tinnitus, self-reported speech difficulty) were independently associated with the BILD. Results: The "lumping" analyses failed to find a significant difference in the BILD between subgroups, questioning the utility of a unitary HHL profile. In contrast, the exploratory "splitting" analysis found a significant interaction between tinnitus and listening condition (β = 1.57, p = 0.009), suggesting that participants with tinnitus exhibited a smaller BILD. The complaint of speech perception difficulty was not significantly associated with a BILD deficit (p = 0.086) but was associated with lower scores on a test of short-term memory (forward digit span, p = 0.046). Conclusion: Our findings challenge the value of a unitary HHL profile for predicting this specific binaural deficit. Instead, our exploratory analysis generated a specific, testable hypothesis of a sensory-cognitive dissociation: in our sample, tinnitus was associated with a reduced capacity for binaural unmasking, while the complaint of speech difficulty was associated with poorer short-term memory. These preliminary findings, derived from post-hoc analysis of an underpowered study, require rigorous validation in larger, pre-registered studies.

18
A computational approach for measuring sentence information via surprisal: theoretical implications in nonfluent primary progressive aphasia

Rezaii, N.; Michaelov, J.; Josephy-Hernandez, S.; Ren, B.; Hochberg, D.; Quimby, M.; Dickerson, B. C.

2022-11-29 neurology 10.1101/2022.11.25.22282630 medRxiv
Top 0.1%
3.5%

Nonfluent aphasia is a language disorder characterized by simplified sentence structures as well as word-level abnormalities such as a reduced use of verbs and function words. According to the predominant account of the disorder, both structural and word-level features are caused by a core deficit in the processing of syntax. Under this account, however, it remains unclear why nonfluent patients choose semantically richer verbs and may have an intact comprehension of verbs and function words. Here, we propose and test the hypothesis that the word-level features of nonfluency reflect a process that selects lexically richer words to increase the information content of sentences. We use a computational linguistic method to measure the information content of sentences in the language of patients with nonfluent primary progressive aphasia (nfvPPA) (n = 36) and healthy controls (n = 133). We measure sentence information using surprisal, a metric calculated by the average probability of occurrence of words in a sentence given their preceding context. We found that by packaging their structurally simple sentences with lower frequency words, nfvPPA patients produce sentences with similar surprisal as that of healthy speakers. Furthermore, we found that higher sentence surprisal in nfvPPA correlates with a lower function-to-all-word ratio, a lower verb-to-noun ratio, and a higher heavy-to-all-verb ratio. Surprisal is an effective quantitative index of sentence information. Using surprisal allows for testing an account of nonfluent aphasia that regards word-level features of nonfluency as adaptive rather than defective symptoms, a finding that may entail revisions in therapeutic approaches to nonfluent speech.
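Surprisal is conventionally defined as the negative log probability of a word given its preceding context, and the sentence-level measure described above averages it over words. A minimal sketch of that conventional definition — the per-word probabilities here are hypothetical inputs, standing in for whatever language model the authors used:

```python
import math

def sentence_surprisal(token_probs):
    """Average surprisal (in bits) of a sentence, given each word's
    conditional probability P(w_i | w_1 .. w_{i-1}) from a language model.

    Lower-frequency (less predictable) words have lower probabilities,
    hence higher surprisal, raising the sentence average."""
    if not token_probs:
        raise ValueError("empty sentence")
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)

# A sentence whose words each have probability 0.25 given their context
# carries -log2(0.25) = 2 bits of surprisal per word.
print(sentence_surprisal([0.25, 0.25, 0.25]))  # → 2.0
```

Under this measure, packaging a structurally simple sentence with rarer (lower-probability) words raises its average surprisal, which is the compensatory pattern the study reports in nfvPPA.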

19
A standardised test to evaluate audio-visual speech intelligibility in French

Le Rhun, L.; Llorach, G.; Delmas, T.; Suied, C.; Arnal, L.; Lazard, D.

2023-01-18 otolaryngology 10.1101/2023.01.18.23284110 medRxiv
Top 0.1%
3.1%

Objective: Lipreading, which plays a major role in the communication of the hearing impaired, lacked a standardised French tool. Our aim was to create and validate an audio-visual (AV) version of the French Matrix Sentence Test (FrMST). Design: Video recordings were created by dubbing the existing audio files. Sample: Thirty-five young, normal-hearing participants were tested in auditory and visual modalities alone (Ao, Vo) and in AV conditions, in quiet, in noise, and in open- and closed-set response formats. Results: Lipreading ability (Vo) varied from 1% to 77% word comprehension. The absolute AV benefit was 9.25 dB SPL in quiet and 4.6 dB SNR in noise. The response format did not influence the results in the AV noise condition, except during the training phase. Lipreading ability and AV benefit were significantly correlated. Conclusions: The French video material achieved similar AV benefits to those described in the literature for AV MSTs in other languages. For clinical purposes, we suggest targeting SRT80 to avoid ceiling effects, and performing two training lists in the AV condition in noise, followed by one AV list in noise, one Ao list in noise, and one Vo list, in a randomised order, in open- or closed-set format.

20
Psychiatric Voice Biomarkers: Methodological flaws in pediatric populations

Hamoudi, H. J. A. S.; Wu, M.-J.; Sanches, M.; Soutullo, C. A.; Olmos, C.; Taylor, L. K.; Zunta-Soares, G.; Soares, J. C.; Mwangi, B.

2025-10-15 psychiatry and clinical psychology 10.1101/2025.10.13.25337901 medRxiv
Top 0.1%
3.1%

Introduction: Psychiatric assessments rely on patient self-reports, clinician observations, and standardized scales, while objective technological tools are currently not reliable enough to be utilized in a clinical setting. Voice may be utilized as a biomarker in different scenarios, including differential diagnosis, assessing symptom severity, and predicting suicidality. However, its use depends on accurate automatic speech recognition (ASR). Current gold-standard open-source ASR systems are trained mainly on adult speech and perform poorly in children, limiting application in pediatric psychiatry. Methods: We benchmarked two open-source ASR models--NVIDIA Parakeet and Whisper-small--on the Ohio Child Speech Corpus (303 children, ages 4-9), using the reference human transcripts provided with the dataset. Audio was standardized to each model's expected sampling rate. No model fine-tuning or adaptation was performed. For each utterance, we computed word error rate (WER) and character error rate (CER), and assessed semantic fidelity using Sentence Mover's Distance (SMD) and BERTScore F1. Metrics were summarized overall, stratified by single-year age bins (4, 5, 6, 7, 8, 9), and also grouped into two broader categories: younger children (ages 4-6) and older children (ages 7-9). We compared WER, CER, SMD, and BERTScore F1 across both age groups and evaluated age effects as trends using nonparametric statistical tests. Results: Both models showed significant age effects: younger children had markedly higher word error rates (WER >40%) and character error rates (CER >30%) compared to older children (WER ~30%, CER ~20%). Sentence Mover's Distance improved with age, while BERTScore F1 remained stable. Despite age-related improvements, overall transcription accuracy was low. Discussion: Current commonly used open-source ASR systems are inadequate for pediatric audio transcription, specifically in younger children. To build clinically translatable tools, collecting child-specific data and fine-tuning models through structured speech paradigms are essential.
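The word and character error rates reported above are standard edit-distance metrics: the minimum number of insertions, deletions, and substitutions needed to turn the hypothesis transcript into the reference, normalized by reference length. A minimal sketch of the conventional computation — not the paper's own evaluation code:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences via dynamic programming."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all remaining reference items
    for j in range(n + 1):
        d[0][j] = j  # insert all remaining hypothesis items
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]

def wer(ref_text, hyp_text):
    """Word error rate: word-level edits over reference word count."""
    ref = ref_text.split()
    return edit_distance(ref, hyp_text.split()) / len(ref)

def cer(ref_text, hyp_text):
    """Character error rate: character-level edits over reference length."""
    return edit_distance(list(ref_text), list(hyp_text)) / len(ref_text)

# Hypothetical example: one substituted word out of three.
print(round(wer("the cat sat", "the cat sit"), 2))  # → 0.33
```

Because both metrics are normalized by reference length, they can exceed 1.0 when a model hallucinates many extra words — one reason WER above 40%, as reported for the youngest children, signals transcripts that are largely unusable.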